traefik_open_connections metric drifts down until negative #10733

Closed
@navaati

Description

Welcome!

  • Yes, I've searched similar issues on GitHub and didn't find any.
  • Yes, I've searched similar issues on the Traefik community forum and didn't find any.

What did you do?

I configured Prometheus monitoring on a Traefik instance with --metrics.prometheus=true on a separate entrypoint (on its own port) with no other Prometheus-specific configuration, so I got the default set of metrics. I can query /metrics all right, and I would expect traefik_open_connections to stay at 5 (as that’s my load: 5 clients from a load testing tool).

What did you see instead?

The value of the traefik_open_connections metric drifts down over time and eventually goes negative, which doesn’t make sense for a connection count (how can you have -1 open connections…).

See this excerpt from the /metrics:

# HELP traefik_open_connections How many open connections exist, by entryPoint and protocol
# TYPE traefik_open_connections gauge
traefik_open_connections{entrypoint="metrics",protocol="TCP"} -1
traefik_open_connections{entrypoint="traefik",protocol="TCP"} 0
traefik_open_connections{entrypoint="web",protocol="TCP"} -1

and this screenshot from Prometheus:
[Screenshot: Prometheus graph of traefik_open_connections]

In the screenshot, the metric is supposed to stay at a constant 5 connections.
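To make the symptom easy to check automatically, here is a minimal sketch that scans a /metrics payload for negative samples of this gauge. The function name `negative_gauges` and the simplified regexes are my own assumptions (they do not handle `}` or escaped quotes inside label values); the sample text is the excerpt above.

```python
import re

# Simplified matcher for exposition lines of the form: name{labels} value
# (assumption: label values contain no '}' and no escaped quotes).
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def negative_gauges(metrics_text, metric_name="traefik_open_connections"):
    """Return (labels, value) pairs for samples of metric_name that are below zero."""
    found = []
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None or match.group("name") != metric_name:
            continue
        value = float(match.group("value"))
        if value < 0:
            labels = dict(LABEL_RE.findall(match.group("labels") or ""))
            found.append((labels, value))
    return found


# The excerpt from this report.
EXCERPT = """\
# HELP traefik_open_connections How many open connections exist, by entryPoint and protocol
# TYPE traefik_open_connections gauge
traefik_open_connections{entrypoint="metrics",protocol="TCP"} -1
traefik_open_connections{entrypoint="traefik",protocol="TCP"} 0
traefik_open_connections{entrypoint="web",protocol="TCP"} -1
"""

if __name__ == "__main__":
    for labels, value in negative_gauges(EXCERPT):
        print(labels.get("entrypoint"), value)
```

On the excerpt above, this flags the metrics and web entrypoints and leaves the traefik entrypoint (value 0) alone.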

As proof that there are actually still 5 connections, here is the output of ss (where we also see the connections to the backend service):

State           Recv-Q           Send-Q                           Local Address:Port                                 Peer Address:Port           Process
ESTAB           0                0                                   172.18.0.3:42488                                  172.18.0.2:8000            users:(("traefik",pid=2506,fd=19))
ESTAB           0                0                                   172.18.0.3:32842                                  172.18.0.2:8000            users:(("traefik",pid=2506,fd=18))
ESTAB           0                0                                   172.18.0.3:38010                                  172.18.0.2:8000            users:(("traefik",pid=2506,fd=17))
ESTAB           0                0                                   172.18.0.3:36600                                  172.18.0.2:8000            users:(("traefik",pid=2506,fd=16))
ESTAB           0                0                                   172.18.0.3:35230                                  172.18.0.2:8000            users:(("traefik",pid=2506,fd=23))
ESTAB           0                0                          [::ffff:172.18.0.3]:80                        [::ffff:217.182.237.24]:57220           users:(("traefik",pid=2506,fd=12))
ESTAB           0                0                          [::ffff:172.18.0.3]:80                        [::ffff:217.182.237.24]:57206           users:(("traefik",pid=2506,fd=10))
ESTAB           0                0                          [::ffff:172.18.0.3]:80                        [::ffff:217.182.237.24]:57250           users:(("traefik",pid=2506,fd=14))
ESTAB           0                0                          [::ffff:172.18.0.3]:80                        [::ffff:217.182.237.24]:57234           users:(("traefik",pid=2506,fd=13))
ESTAB           0                0                          [::ffff:172.18.0.3]:80                        [::ffff:217.182.237.24]:57222           users:(("traefik",pid=2506,fd=11))
ESTAB           0                0                          [::ffff:172.18.0.3]:8089                      [::ffff:217.182.237.24]:34364           users:(("traefik",pid=2506,fd=20))
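For comparison with the gauge, the ss output above can be tallied per local port. This is only a sketch: the helper name `established_by_local_port` is hypothetical, and it assumes the column layout shown above (State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port, Process). The sample data is the output above with whitespace collapsed.

```python
from collections import Counter


def established_by_local_port(ss_output):
    """Count ESTAB sockets per local port in ss output laid out as in this report."""
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] != "ESTAB":
            continue  # skips the header row and any non-established sockets
        local = fields[3]
        # The port is whatever follows the last ':'; this also works for
        # bracketed IPv6-mapped addresses such as [::ffff:172.18.0.3]:80.
        port = int(local.rsplit(":", 1)[1])
        counts[port] += 1
    return counts


# The ss output from this report, whitespace collapsed.
SS_OUTPUT = """\
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
ESTAB 0 0 172.18.0.3:42488 172.18.0.2:8000 users:(("traefik",pid=2506,fd=19))
ESTAB 0 0 172.18.0.3:32842 172.18.0.2:8000 users:(("traefik",pid=2506,fd=18))
ESTAB 0 0 172.18.0.3:38010 172.18.0.2:8000 users:(("traefik",pid=2506,fd=17))
ESTAB 0 0 172.18.0.3:36600 172.18.0.2:8000 users:(("traefik",pid=2506,fd=16))
ESTAB 0 0 172.18.0.3:35230 172.18.0.2:8000 users:(("traefik",pid=2506,fd=23))
ESTAB 0 0 [::ffff:172.18.0.3]:80 [::ffff:217.182.237.24]:57220 users:(("traefik",pid=2506,fd=12))
ESTAB 0 0 [::ffff:172.18.0.3]:80 [::ffff:217.182.237.24]:57206 users:(("traefik",pid=2506,fd=10))
ESTAB 0 0 [::ffff:172.18.0.3]:80 [::ffff:217.182.237.24]:57250 users:(("traefik",pid=2506,fd=14))
ESTAB 0 0 [::ffff:172.18.0.3]:80 [::ffff:217.182.237.24]:57234 users:(("traefik",pid=2506,fd=13))
ESTAB 0 0 [::ffff:172.18.0.3]:80 [::ffff:217.182.237.24]:57222 users:(("traefik",pid=2506,fd=11))
ESTAB 0 0 [::ffff:172.18.0.3]:8089 [::ffff:217.182.237.24]:34364 users:(("traefik",pid=2506,fd=20))
"""

if __name__ == "__main__":
    counts = established_by_local_port(SS_OUTPUT)
    print("web (:80):", counts[80])
    print("metrics (:8089):", counts[8089])
```

The tally gives 5 established client connections on port 80 (the web entrypoint) and 1 on port 8089 (the metrics entrypoint), while the gauge reports -1 for both.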

What version of Traefik are you using?

Version:      3.0.0
Codename:     beaufort
Go version:   go1.22.2
Built:        2024-04-29T14:25:59Z
OS/Arch:      linux/amd64

What is your environment & configuration?

      --entrypoints.web.address=:80
      --entryPoints.web.asDefault=true
      --providers.docker
      --providers.docker.exposedbydefault=false
      --api.dashboard=true
      --api.insecure=true
      --entryPoints.metrics.address=:8089
      --metrics.prometheus=true
      --metrics.prometheus.entryPoint=metrics

The labels on the backend container discovered by the docker provider:

     - "traefik.enable=true"
     - "traefik.http.routers.app.rule=PathPrefix(`/`)"
     - "traefik.http.services.app.loadbalancer.server.port=8000"

The load on the service is a constant 5 clients repeatedly requesting the service, generated by locust (a Python load testing tool). The backend is a Python app that can handle only one client at a time and takes around 110ms per request, so with a load of 5 clients it handles around 9 req/s with a latency of around 550ms.
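The throughput and latency figures follow from simple arithmetic, sketched below. It assumes the backend is always busy, serves strictly one request at a time, and that the clients send their next request immediately (no think time); the function name `closed_loop_estimate` is hypothetical.

```python
def closed_loop_estimate(clients, service_time_s):
    """Estimate throughput (req/s) and per-request latency (s) for a saturated
    single-worker backend under a fixed number of back-to-back clients."""
    # One request is served at a time, so throughput is bounded by the service time.
    throughput = 1.0 / service_time_s
    # Little's law: clients in the system = throughput * latency.
    latency = clients / throughput
    return throughput, latency


if __name__ == "__main__":
    throughput, latency = closed_loop_estimate(clients=5, service_time_s=0.110)
    print(f"throughput: {throughput:.1f} req/s")
    print(f"latency:    {latency * 1000:.0f} ms")
```

With 5 clients and a 110ms service time this gives roughly 9.1 req/s and 550ms, matching the numbers observed.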

If applicable, please paste the log output in DEBUG level

No response
